Evaluating entity resolution results
Similar resources
Evaluating Entity Resolution Results
Entity Resolution (ER) is the process of identifying groups of records that refer to the same real-world entity. Various measures (e.g., pairwise F1, cluster F1) have been used for evaluating ER results. However, ER measures tend to be chosen in an ad-hoc fashion without careful thought as to what defines a good result for the specific application at hand. In this paper, our contributions are t...
Evaluating Entity Resolution Results (Extended version)
Entity Resolution (ER) is the process of identifying groups of records that refer to the same real-world entity. Various measures (e.g., pairwise F1, cluster F1) have been used for evaluating ER results. However, ER measures tend to be chosen in an ad-hoc fashion without careful thought as to what defines a good result for the specific application at hand. In this paper, our contributions are t...
A Practioner's Guide to Evaluating Entity Resolution Results
Entity resolution (ER) is the task of identifying records belonging to the same entity (e.g. individual, group) across one or multiple databases. Ironically, it has multiple names: deduplication and record linkage, among others. In this paper we survey metrics used to evaluate ER results in order to iteratively improve performance and guarantee sufficient quality prior to deployment. Some of th...
A Dynamic Indexing for Incremental Entity Resolution over Query Results
Entity Resolution (ER) is the problem of identifying groups of tuples from one or multiple data sources that represent the same real-world entity. This is a crucial stage of data integration processes, which often need to integrate data at query time. This task becomes more challenging in scenarios with dynamic data sources or with a large volume of data. As most ER techniques deal with all tup...
Unsupervised Named Entity Resolution
Resolving the ambiguity of person, organisation and location names is a challenging problem in the Natural Language Processing (NLP) area. This problem is usually formulated as a clustering problem, in which the target is to group mentions of the same entity into the same cluster. In this paper, we present a different approach based on the Distributional Hypothesis and edit distance, which asso...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2010
ISSN: 2150-8097
DOI: 10.14778/1920841.1920871